Abstract

Background: Unstable Angina (UA) is widely accepted as a critical phase of coronary heart disease with patients 
exhibiting widely varying risks. Early risk assessment of UA is at the center of the management program, which allows 
physicians to categorize patients according to the clinical characteristics and stratification of risk and different 
prognosis. Although many prognostic models have been widely used for UA risk assessment in clinical practice, a 
number of studies have highlighted possible shortcomings. One serious drawback is that existing models lack the 
ability to deal with the intrinsic uncertainty about the variables utilized. 

Methods: In order to help physicians refine knowledge for the stratification of UA risk with respect to vagueness in 
information, this paper develops an intelligent system combining genetic algorithm and fuzzy association rule 
mining. In detail, it models the input information's vagueness through fuzzy sets, and then applies a genetic fuzzy 
system on the acquired fuzzy sets to extract the fuzzy rule set for the problem of UA risk assessment. 

Results: The proposed system is evaluated using a real data-set collected from the cardiology department of a
Chinese hospital, which consists of 54 patient cases. The 9 numerical and 17 categorical patient features
that appear in the data-set are used in the experiments. The proposed system made the same decisions as the
physician in 46 of the 54 tested cases (85.2%).

Conclusions: By comparing the results that are obtained through the proposed system with those resulting from the 
physician's decision, it has been found that the developed model is highly reflective of reality. The proposed system 
could be used for educational purposes, and with further improvements, could assist and guide young physicians in 
their daily work. 

Keywords: Unstable angina risk assessment, Fuzzy association rule mining, Genetic algorithm



Background 

Unstable Angina (UA) is a kind of chest discomfort or pain that occurs in a continuous and
unpredictable way [1,2]. The unstable pain can result from the disruption of an atherosclerotic
plaque in narrowed coronary vessels with lessened flexibility, embolization and vasospasm. As
a major type of Cardiovascular Disease (CVD), UA lies between stable angina and acute
myocardial infarction, which may be followed by sudden death [2]. The risk of UA is high and
the UA population is large, especially among aged people and those with associated diseases
such as hypertension and diabetes [3]. Reliable assessment of risk levels for individual UA
patients is therefore of significant value and interest.

A number of models for UA risk assessment have been proposed in the literature. Most of these
models are derived from databases of clinical trials, e.g., the Thrombolysis in Myocardial
Infarction (TIMI) [4], Platelet Glycoprotein IIb/IIIa in Unstable Angina: Receptor Suppression
Using Integrilin (PURSUIT) [5], and the Global Registry of Acute Coronary Events (GRACE) [6].
They use standard patient features that are part of the routine medical evaluation of UA
patients, and lead to a score that defines prognostic groups [2]. Although there are many
benefits related to the design and use of these prognostic models, a number of studies have
highlighted possible shortcomings [2,4]. One serious drawback is that existing models lack the
ability to deal with the intrinsic uncertainty about the patient features utilized in UA risk
assessment. Vagueness is a fundamental and indispensable aspect of knowledge, and, as in many
practical problems, experts face vagueness in feature vectors. According to Bellman and Zadeh,
"much of the decision making in the real world takes place in an environment in which the
goals, the constraints, and consequences of possible actions are not known precisely" [7].
In UA risk assessment, many patient features are vague and not easily handled by existing
models. It is, therefore, necessary to develop a new UA risk assessment model that deals with
vague information.

In this paper, a novel UA risk assessment model has 
been developed using fuzzy set theories. The proposed 
model represents patient features with fuzzy sets and then 
extracts useful information with a descriptive rule induc- 
tion approach based on fuzzy systems. To derive fuzzy 
rules from data, the proposed model employs genetic 
algorithms (GAs) to learn rule base from the collected 
data-set. GAs are search algorithms based on natural 
genetics that provide robust search capabilities in com- 
plex spaces [8]. The hybridization between fuzzy systems 
and GAs, called genetic fuzzy system (GFS), has attracted 
considerable attention in the computational intelligence 
community [9-12]. Our main goal is to develop a novel GFS that derives from a clinical
data-set a set of assessment rules with good interpretability, and then builds an efficient
assessment model on these rules to achieve high accuracy of UA risk stratification. Since the
accuracy of an assessment model can be largely affected by how vague patient features are
processed, this paper also discusses a clustering-based method for patient feature
partitioning.

This paper is organized as follows. Section 'Prelim- 
inary' presents preliminary knowledge used in this paper. 
Section 'Method' describes the development of the 
genetic-fuzzy system for UA risk assessment. Exper- 
imental studies of the performance of the proposed 
approach are presented in Section 'Results and discus- 
sion'. Section 'Conclusion' concludes the paper. 

Preliminary 

Let $D = \{\sigma_1, \cdots, \sigma_{|D|}\}$ be a patient data-set consisting of a finite set of
UA patient cases. Let $A = \{a_1, \cdots, a_n\}$ represent all patient features that appear in
$D$ and Class = {low-risk, medium-risk, high-risk} be a set of UA risk levels. Each feature $a$
may have a categorical or numerical underlying domain, denoted $dom(a)$. Each patient case
$\sigma$ ($\sigma \in D$) contains values of some patient features from $A$. Let $\sigma(a)$
($\sigma(a) \in dom(a)$) be the target feature value for the patient case $\sigma$ for
feature $a$.

For example, Table 1 shows an example patient data-set, which consists of six patient cases.
Each case contains 1 numerical patient feature (i.e., age) and 3 categorical patient features
(i.e., sex, smoking, and has heart events recently).

Table 1 An example patient data-set

Case       | Age (year) | Sex    | Smoking | Heart events recently | Physician assessment
$\sigma_1$ | 74         | Male   | No      | Yes                   | Medium-risk
$\sigma_2$ | 81         | Female | No      | Yes                   | Medium-risk
$\sigma_3$ | 74         | Male   | Yes     | Yes                   | High-risk
$\sigma_4$ | 71         | Male   | No      | Yes                   | High-risk
$\sigma_5$ | 76         | Female | No      | No                    | Low-risk
$\sigma_6$ | 67         | Male   | Yes     | No                    | Medium-risk

The cases are simplified information extracted from patient records of the Chinese
PLA General Hospital.

For a numerical patient feature $a$ ($a \in A$), let $\{l_a^1, l_a^2, \cdots, l_a^m\}$ be the
linguistic terms defined over $a$.

Let $\mu_{a,j}(\sigma(a))$ be the membership degree of the value of feature $a$ of the patient
case $\sigma$ to the fuzzy set corresponding to the linguistic label $l_a^j$ for this feature
$a$. Note that the degree of membership of each value of $a$ in any of the fuzzy sets specified
for $a$ is directly based on the evaluation of the membership function of the particular fuzzy
set with the value of $a$ as input. The fuzzy partition of $dom(a)$ is composed of
$\{l_a^1, \cdots, l_a^m\}$ that satisfies
$\sum_{j=1}^{m} \mu_{a,j}(x) = 1, \forall x \in dom(a)$.

In this study, we employ fuzzy rules of the following form [9,12]:

$$r : Cond \rightarrow Class \qquad (1)$$

where Cond is the antecedent part of the rule, and Class is the consequent part of the rule.
For example, a fuzzy rule can be expressed as:

$$R: \text{IF } a_1 \text{ is } (l_{a_1}^{j} \text{ or } l_{a_1}^{k}) \text{ and } a_s \text{ is } l_{a_s}^{q} \text{ THEN low-risk} \qquad (2)$$

It must be noted that any subset of the complete set of patient features, with any combination
of linguistic labels related by the operators and and or, can take part in the rule antecedent.
For this kind of fuzzy rule, we say that a patient case $\sigma$ supports the antecedent part
of a rule $r$ if

$$APC(\sigma, r) = \frac{1}{n} \sum_{i=1}^{n} \max\{\mu_{a_i,1}(\sigma(a_i)), \cdots, \mu_{a_i,m}(\sigma(a_i))\} > 0 \qquad (3)$$

where $\mu_{a_i,j}(\sigma(a_i))$ is the membership degree of patient feature $a_i$ for $\sigma$
to the fuzzy set corresponding to the linguistic label $l_{a_i}^{j}$ for $a_i$, and APC is the
antecedent part compatibility between a patient case and the antecedent part of a fuzzy rule.
For the categorical features, the degrees of membership are zero or one.

For a patient case $\sigma$, the support degree of $\sigma$ by a specific rule $r$ is
calculated as follows:

$$Supp(\sigma, r) = \begin{cases} APC(\sigma, r) & \text{if the actual risk level of } \sigma \text{ is equal to the class of } r \\ 0 & \text{otherwise} \end{cases} \qquad (4)$$



In general, a fuzzy rule can be considered to be a classification rule if the antecedent
contains fuzzy item sets, and the consequent part contains only one class label such as
low-risk, medium-risk, or high-risk in this study. A fuzzy rule $r : Cond \rightarrow Class$
can be measured directly in terms of support and confidence as follows:

$$Support(r) = \frac{\sum_{\sigma \in D} Supp(\sigma, r)}{|D|} \qquad (5)$$

$$Confidence(r) = \frac{\sum_{\sigma \in D} Supp(\sigma, r)}{\sum_{\sigma \in D} APC(\sigma, r)} \qquad (6)$$
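The measures in Equations (3) to (6) can be sketched in a few lines of Python. This is only an illustrative sketch under stated assumptions: membership degrees are precomputed per case, a rule antecedent is stored as a mapping from each feature to its admitted labels, the average in Equation (3) is taken over the features in the antecedent, support is normalized by the data-set size, and confidence by the total compatibility. The names `apc`, `supp`, `support`, `confidence` and the two toy cases are hypothetical.

```python
from typing import Dict, List, Set, Tuple

Memberships = Dict[str, Dict[str, float]]   # feature -> label -> degree
Antecedent = Dict[str, Set[str]]            # feature -> admitted labels
Case = Tuple[Memberships, str]              # (memberships, physician class)


def apc(case: Memberships, cond: Antecedent) -> float:
    """Equation (3): mean, over antecedent features, of the best label match."""
    best = [max(case[f].get(lbl, 0.0) for lbl in labels)
            for f, labels in cond.items()]
    return sum(best) / len(best) if best else 0.0


def supp(case: Case, cond: Antecedent, rule_class: str) -> float:
    """Equation (4): APC if the case belongs to the rule's class, else 0."""
    memberships, actual = case
    return apc(memberships, cond) if actual == rule_class else 0.0


def support(data: List[Case], cond: Antecedent, rule_class: str) -> float:
    """Equation (5): summed support degree divided by the data-set size."""
    return sum(supp(c, cond, rule_class) for c in data) / len(data)


def confidence(data: List[Case], cond: Antecedent, rule_class: str) -> float:
    """Equation (6): summed support degree divided by summed compatibility."""
    total = sum(apc(m, cond) for m, _ in data)
    return sum(supp(c, cond, rule_class) for c in data) / total if total else 0.0


# Two hypothetical cases with made-up membership degrees (not study data).
data: List[Case] = [
    ({"age": {"old": 0.8, "very old": 0.2}, "smoking": {"yes": 1.0}}, "high-risk"),
    ({"age": {"middle-aged": 0.6, "old": 0.4}, "smoking": {"no": 1.0}}, "medium-risk"),
]
rule: Antecedent = {"age": {"old", "very old"}, "smoking": {"yes"}}

print(support(data, rule, "high-risk"), confidence(data, rule, "high-risk"))
```

In this toy setting the first case is compatible with the rule to degree 0.9 and the second to degree 0.2, so the rule covers the high-risk case strongly while still being weakly triggered by a medium-risk case, which is what lowers its confidence.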



Method 

In this section, we describe the process of utilizing 
GFS to develop an intelligent system for the problem 
of UA risk assessment. As shown in Figure 1, the pro- 
posed method consists of three steps. At first, all the 
numerical patient features of the data-set are given as 
input for the fuzzy clustering module for calculating the 
membership functions. Then, calculated function val- 
ues are given to the rule generation module for obtain- 
ing UA risk assessment rules. Based on the derived 
rules, a classification model for UA risk assessment is 
generated. 

The case study was performed in the Cardiology Department at the Chinese PLA General
Hospital. Prior approval was obtained from the data protection committee of the hospital to
conduct the study. We state that the patient data were anonymized in this study.

Fuzzy clustering for numerical feature discretization 

One of the most important steps in UA risk assessment is to deal with the intrinsic
uncertainty about the variables utilized. As described in [13], fuzzy sets are a common tool
for facilitating the interpretation of rules in linguistic terms, and for avoiding unnatural
boundaries in the partitioning of the variable domains. They are especially useful in clinical
settings where the boundaries of a piece of information may not be clearly defined. In our
task of UA risk assessment, the quality of the results relies crucially on the appropriateness
of the fuzzy sets to the given patient features. The fuzzy sets must therefore be consistent
with the values of the corresponding feature.

Fuzzy sets can be provided by physicians. However, fuzzy sets provided by physicians may not
be suitable for mining fuzzy association rules from a data-set. Also, it is extremely
difficult for physicians to estimate the most appropriate fuzzy sets. To cope with these
problems, we first concentrate on how the fuzzy sets of the given features can be determined
automatically from the collected data-set. Clustering techniques are usually employed as a
preprocessing step to partition numerical features [14]. In this study, we employed a
hierarchical agglomerative clustering algorithm [15-17] to partition numerical features.

As shown in Algorithm 1, hierarchical agglomerative clustering begins with each value as a
separate cluster and merges them into successively larger clusters. The process is repeated
until the similarity between any pair of clusters is less than a threshold value
$\varepsilon$. Consequently, the algorithm builds a structure called a dendrogram, i.e., a
tree illustrating the merging process and intermediate clusters.
Similarity between two clusters $c_1$ and $c_2$ can be measured as follows:

$$sim(c_1, c_2) = \frac{1}{|c_1||c_2|} \sum_{x \in c_1} \sum_{y \in c_2} |x - y| \qquad (7)$$

where $|c_1|$ and $|c_2|$ are the numbers of values in clusters $c_1$ and $c_2$,
respectively.



Figure 1 The main steps of the proposed UA risk assessment model. (Flowchart: Data set,
Numerical feature fuzzy clustering, Membership function values, Rule generation, UA risk
assessment rules.)
Algorithm 1 Hierarchical Agglomerative Clustering

Input:
    $D_a$ is a value set of a specific patient feature $a$
    $sim$ is a cluster similarity measure
    $\varepsilon$ is a merging threshold value
Output:
    $C$ is the set of clusters
Steps:
    Let $C$ = {initial clusters} where each value in $D_a$ forms an initial cluster
    Repeat
        Let $(c_1, c_2)$ be a pair of clusters which are most similar in $C$
        If $sim(c_1, c_2) > \varepsilon$ then
            Let $c_3 \leftarrow c_1 \cup c_2$
            Add $c_3$ into $C$
            Remove $c_1$ and $c_2$ from $C$
        End If
    Until cannot find $c_1$ and $c_2$ with $sim(c_1, c_2) > \varepsilon$
End Procedure
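The clustering loop of Algorithm 1 can be sketched in Python for a single numerical feature. This is a minimal sketch under one explicit assumption: closeness between clusters is measured by the average absolute difference over all cross-cluster pairs, so "most similar" is taken to mean the smallest average difference, and merging continues while that value stays at or below the threshold `eps`. The function names and the age values are hypothetical and are not taken from the study data.

```python
from itertools import combinations
from typing import List


def avg_distance(c1: List[float], c2: List[float]) -> float:
    """Average absolute difference between all cross-cluster pairs."""
    return sum(abs(x - y) for x in c1 for y in c2) / (len(c1) * len(c2))


def agglomerate(values: List[float], eps: float) -> List[List[float]]:
    """Merge the closest pair of clusters until no pair is within eps."""
    clusters = [[v] for v in sorted(set(values))]   # one cluster per value
    while len(clusters) > 1:
        i, j = min(combinations(range(len(clusters)), 2),
                   key=lambda p: avg_distance(clusters[p[0]], clusters[p[1]]))
        if avg_distance(clusters[i], clusters[j]) > eps:
            break                                   # no pair is close enough
        merged = sorted(clusters[i] + clusters[j])
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)]
        clusters.append(merged)
    return sorted(clusters)


def centroids(clusters: List[List[float]]) -> List[float]:
    """Cluster means, used later as midpoints of the fuzzy sets."""
    return [sum(c) / len(c) for c in clusters]


ages = [30, 33, 55, 57, 60, 85]          # hypothetical ages, not study data
clusters = agglomerate(ages, eps=6.0)
print(clusters, centroids(clusters))
```

With these toy values the loop merges 55 with 57, then 30 with 33, then 60 into the middle cluster, and stops because the remaining clusters are more than 6 years apart on average.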



This way, the values of each patient feature in the data-set are distributed over a set of
derived clusters using Algorithm 1. For each patient feature, the centroids of the clusters
are the set of midpoints of the fuzzy sets. To illustrate the process, suppose we want to find
fuzzy sets for a specific patient feature $a$, which is quantitative with a range from
$\min(dom(a))$ to $\max(dom(a))$. Let $\{v_a^1, \cdots, v_a^m\}$ be the set of mid-points of
the fuzzy sets for $a$. As a result, the derived fuzzy sets will have the following ranges:
$[v_a^0, v_a^1]$, $[v_a^0, v_a^2]$, $\cdots$, $[v_a^{m-1}, v_a^{m+1}]$ and
$[v_a^m, v_a^{m+1}]$, where $v_a^0 = \min(dom(a))$, and $v_a^{m+1} = \max(dom(a))$.

After the fuzzy sets of each numerical feature are obtained, the corresponding membership
function can be generated for each fuzzy set. In this study, we used membership functions of
both semi-trapezoidal shape and triangular shape because they are in general the most
appropriate shapes and the most widely used in fuzzy systems. For example, for the fuzzy set
with a range from $v_a^0$ to $v_a^1$, the membership function is given by

$$\mu_{a,0}(x) = \begin{cases} \dfrac{v_a^1 - x}{v_a^1 - v_a^0} & \text{if } v_a^0 \le x \le v_a^1 \\ 0 & \text{if } x > v_a^1 \end{cases} \qquad (8)$$



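For each fuzzy set with midpoint $v_a^j$, where $1 \le j \le m$, the membership function is
given by

$$\mu_{a,j}(x) = \begin{cases} \dfrac{x - v_a^{j-1}}{v_a^j - v_a^{j-1}} & \text{if } v_a^{j-1} \le x \le v_a^j \\ \dfrac{v_a^{j+1} - x}{v_a^{j+1} - v_a^j} & \text{if } v_a^j < x \le v_a^{j+1} \\ 0 & \text{otherwise} \end{cases} \qquad (9)$$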



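And for the fuzzy set with a range from $v_a^m$ to $v_a^{m+1}$, the membership function is
given by

$$\mu_{a,m+1}(x) = \begin{cases} \dfrac{x - v_a^m}{v_a^{m+1} - v_a^m} & \text{if } v_a^m \le x \le v_a^{m+1} \\ 0 & \text{if } x < v_a^m \end{cases} \qquad (10)$$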



For example, consider a numerical feature, say age, with three different ranges, i.e.,
[30, 56], [56, 74], and [74, 87]. The values of age range from 30 to 87, and can be classified
into four fuzzy sets, as shown in Figure 2.
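The age example can be made concrete with a short Python sketch. It assumes that adjacent fuzzy sets cross linearly between consecutive breakpoints, so that the first and last sets are the semi-trapezoidal ones, the inner sets are triangular, and the degrees of any value sum to one. The function name `memberships`, the clamping of out-of-range values, and the label order are illustrative assumptions.

```python
from typing import List

AGE_BREAKPOINTS = [30.0, 56.0, 74.0, 87.0]     # v0, v1, v2, v3 from the age example
AGE_LABELS = ["young", "middle-aged", "old", "very old"]


def memberships(x: float, v: List[float]) -> List[float]:
    """Degrees of x in the len(v) fuzzy sets peaking at the breakpoints v."""
    mu = [0.0] * len(v)
    x = min(max(x, v[0]), v[-1])               # clamp to the domain (assumption)
    for j in range(len(v) - 1):
        lo, hi = v[j], v[j + 1]
        if lo <= x <= hi:
            mu[j + 1] = (x - lo) / (hi - lo)   # rising edge of the next set
            mu[j] = 1.0 - mu[j + 1]            # falling edge of the current set
            break
    return mu


for age in (30, 65, 74, 80):
    degrees = memberships(age, AGE_BREAKPOINTS)
    print(age, dict(zip(AGE_LABELS, degrees)))
```

Under these assumptions a 65-year-old patient is middle-aged and old to the same degree (0.5 each), while a 74-year-old patient is fully old.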

Fuzzy association rule mining 

This subsection presents a GFS for mining fuzzy associa- 
tion rules from a data-set. The proposed system uses fuzzy 
rule format defined in Equation (2), which offers a flexi- 
ble structure to the rules, allowing each patient feature to 
take more than one value and facilitating the extraction of 
general UA risk assessment rules. 

Chromosome representation 

As mentioned above, a UA risk assessment rule $r$ consists of an antecedent part Cond and a
consequent part Class. In this study, we code the antecedent part Cond of $r$ as one
chromosome consisting of a set of segments. Each segment corresponds to a specific patient
feature. For categorical features, the set of possible values is the one given by the
problem; for numerical features, it is the set of linguistic terms determined by the
clustering method presented above. The consequent part Class of $r$ is fixed to one of the
possible values of risk levels, i.e., high-risk, medium-risk, and low-risk.

Figure 2 The membership functions for patient feature, age. (Horizontal axis: age (year),
with breakpoints at 56, 74 and 87.)

Table 2 describes a representation for a rule with numerical and categorical features for a
specific risk level, high-risk. Note that one bit is stored for each of the possible values
of each feature. If the value of the corresponding element is 0, the value is not used in the
rule; if it is 1, the corresponding value is included. If all the elements corresponding to a
feature contain the value 1, or all of them contain the value 0, the feature has no relevance
for the information contributed by the rule, and so it is ignored and does not take part in
the rule. For example, as shown in Table 2, the rule is represented by the binary string
((0011)(00)(10)(10) : high-risk), where parentheses separate segments, and ":" separates the
IF part from the THEN part of the rule. In this example, $a_1$ has four possible values and
$a_2$, $a_3$, and $a_4$ have two possible values each. Note that $a_2$ does not take part in
the rule as it takes none of its values, and thus $a_2$ is irrelevant for the rule. This
binary string can be interpreted as the following rule: "IF age is (old or very old) and
smoking is true and has heart events recently is true THEN UA risk is high".

Table 2 Representation of a fuzzy rule with numerical and categorical features in UA risk assessment

Patient feature  | Age                               | Sex          | Smoking | Has heart events recently | Class
Linguistic label | Young  Middle-aged  Old  Very old | Male  Female | Yes  No | Yes  No                   |
Rule             | 0      0            1    1        | 0     0      | 1    0  | 1    0                    | High-risk
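The bit-string encoding of Table 2 can be decoded with a few lines of Python. This is a sketch of the encoding only; the feature layout mirrors Table 2, while the names `LAYOUT`, `decode` and `to_text` are hypothetical helpers introduced for illustration.

```python
from typing import Dict, List, Tuple

# Feature layout of Table 2: each feature owns one bit per admissible value.
LAYOUT: List[Tuple[str, List[str]]] = [
    ("age", ["young", "middle-aged", "old", "very old"]),
    ("sex", ["male", "female"]),
    ("smoking", ["yes", "no"]),
    ("has heart events recently", ["yes", "no"]),
]


def decode(bits: str, layout=LAYOUT) -> Dict[str, List[str]]:
    """Map a bit string to {feature: admitted values}, skipping irrelevant features."""
    cond: Dict[str, List[str]] = {}
    pos = 0
    for feature, values in layout:
        segment = bits[pos:pos + len(values)]
        pos += len(values)
        chosen = [v for v, b in zip(values, segment) if b == "1"]
        # All-zero and all-one segments carry no information and are dropped.
        if 0 < len(chosen) < len(values):
            cond[feature] = chosen
    return cond


def to_text(cond: Dict[str, List[str]], risk: str) -> str:
    """Render a decoded antecedent as a readable IF-THEN rule."""
    parts = [f"{f} is ({' or '.join(vals)})" for f, vals in cond.items()]
    return "IF " + " and ".join(parts) + f" THEN UA risk is {risk}"


print(to_text(decode("0011001010"), "high"))
```

The sex segment (00) is dropped during decoding, which matches the rule of ignoring features whose bits are all equal.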

Fitness function and selection process 

The objective of this step is to find accurate and general rules for UA risk assessment.
Thus, given a specific rule $r$, the GA method uses a composite fitness function consisting
of support and confidence:

$$Fitness(r) = \frac{Support(r) + Confidence(r)}{2} \qquad (11)$$

This composite measure provides an effective selection environment which balances the
accuracy and generality of the rules.
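As a quick arithmetic check, assuming the fitness is the arithmetic mean of support and confidence, the values listed for rule 1 of Table 3 give the reported fitness up to rounding:

```latex
\mathrm{Fitness}(r_1) = \frac{\mathrm{Support}(r_1) + \mathrm{Confidence}(r_1)}{2}
                      = \frac{0.083 + 0.5}{2} = 0.2915 \approx 0.291
```

The same computation reproduces the other fitness values of Table 3 to within rounding of the tabulated inputs, e.g. (0.339 + 0.465) / 2 = 0.402 for rule 4.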

Three operators, i.e., selection, crossover, and mutation, 
are applied in the proposed GA method to generate the 
offspring population, which are illustrated as follows: 

• The selection procedure is used for evolution where 
two individuals are selected randomly from the 
current population and used for crossover and 
mutation operators. During each generation, 
individuals with higher fitness values survive while 
those with lower fitness values are destroyed. 

• Two parents are selected and recombined according 
to the predefined crossover probability during 
crossover. In this work, one point crossover is applied 
due to its simplicity, which can randomly select 
different cutoff points for each parent to generate 
offspring rule sets. 

• Each element in the chromosome is subject to mutation with a predefined mutation
probability. The value of a randomly selected element is converted to 0 if its value is 1,
and vice versa. Elimination of existing rules and addition of new rules can also be used as
mutation operations. As a result, the number of rules in the rule set string can change
accordingly.
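The crossover and mutation operators described above can be sketched as follows. This is a simplified illustration for fixed-length bit chromosomes: it uses a single cut point shared by both parents, whereas the text allows different cut points per parent for variable-length rule sets. The function names and the explicit random generator argument are assumptions made for the example.

```python
import random
from typing import List, Tuple

Chromosome = List[int]


def one_point_crossover(p1: Chromosome, p2: Chromosome, rate: float,
                        rng: random.Random) -> Tuple[Chromosome, Chromosome]:
    """Swap the tails of two equal-length parents at one random cut point."""
    if rng.random() >= rate or len(p1) < 2:
        return p1[:], p2[:]                    # no recombination this time
    cut = rng.randrange(1, len(p1))            # cut strictly inside the string
    return p1[:cut] + p2[cut:], p2[:cut] + p1[cut:]


def mutate(chrom: Chromosome, rate: float, rng: random.Random) -> Chromosome:
    """Flip each bit independently with probability rate."""
    return [1 - b if rng.random() < rate else b for b in chrom]


rng = random.Random(0)
child1, child2 = one_point_crossover([0] * 10, [1] * 10, rate=1.0, rng=rng)
print(child1, child2, mutate(child1, rate=0.2, rng=rng))
```

Because the two children exchange complementary tails, every bit position still holds one 0 and one 1 across the pair, which is a convenient property to check when testing the operator.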

During GA operations, redundant rules might be produced. For example, we say a rule "If Age
is (Young) and ST is (Low) then Risk is (Low)" is redundant w.r.t. the other rule "If Age is
(Young OR Middle-aged) and ST is (Low) then Risk is (Low)", if both rules have the same
support degree on a data-set. Thus, the proposed algorithm must check the rule sets and
retain only one rule from each group of redundant rules, to guarantee the consistency of the
fuzzy system. The stopping criterion of the proposed algorithm is the number of generations.
The scheme of the proposed algorithm is shown in Algorithm 2.



Algorithm 2 The GA-based rule mining algorithm for UA risk assessment.

Input:
    UA data-set, algorithm parameters
Output:
    $\Re$ is the set of derived UA risk assessment rules
Steps:
    For each target class $c$ of UA risk levels, apply the following steps
        Step 1: Initialize $P_c$ as a set of UA risk assessment rules
        Step 2: Fitness evaluation for each individual $r \in P_c$
        Step 3: While not (termination condition) do
            Step 3.1: Select $r_{p1}$ and $r_{p2}$ from $P_c$ with high fitness value
            Step 3.2: Apply crossover to $r_{p1}$ and $r_{p2}$, obtaining $r_{c1}$ and $r_{c2}$
            Step 3.3: Apply mutation to $r_{c1}$ and $r_{c2}$
            Step 3.4: Check if $r_{c1}$ and $r_{c2}$ have been stored in $P_c$; if not
                Step 3.4.1: Update $P_c$ by deleting the worst solution and adding $r_{c1}$ and $r_{c2}$
        Step 4: Add $P_c$ to rule pool $\Re$
    Output selected rule set from $\Re$
End Procedure
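The per-class loop of Algorithm 2 can be sketched as a small steady-state GA in Python. Several choices here are assumptions rather than details given in the text: parents are picked by binary tournament, the population size is kept fixed by letting each unseen child replace the current worst individual, and the fitness is passed in as a callback. The toy fitness used in the example (the number of 1 bits) is purely hypothetical and only exercises the loop; in the proposed system the fitness of a rule would come from its support and confidence for the target class.

```python
import random
from typing import Callable, List

Chromosome = List[int]


def evolve(fitness: Callable[[Chromosome], float], n_bits: int, pop_size: int,
           generations: int, cx_rate: float, mut_rate: float,
           seed: int = 0) -> List[Chromosome]:
    """Return the final population for one target class, best individual first."""
    rng = random.Random(seed)
    # Step 1: random initial population of bit-string antecedents.
    pop = [[rng.randint(0, 1) for _ in range(n_bits)] for _ in range(pop_size)]
    for _ in range(generations):                       # Step 3
        # Step 3.1: choose two parents by binary tournament (assumed scheme).
        p1 = max(rng.sample(pop, 2), key=fitness)
        p2 = max(rng.sample(pop, 2), key=fitness)
        # Step 3.2: one-point crossover with probability cx_rate.
        c1, c2 = p1[:], p2[:]
        if rng.random() < cx_rate:
            cut = rng.randrange(1, n_bits)
            c1, c2 = p1[:cut] + p2[cut:], p2[:cut] + p1[cut:]
        for child in (c1, c2):
            # Step 3.3: bit-flip mutation.
            child = [1 - b if rng.random() < mut_rate else b for b in child]
            # Step 3.4: an unseen child replaces the current worst individual.
            if child not in pop:
                worst = min(range(pop_size), key=lambda i: fitness(pop[i]))
                pop[worst] = child
    return sorted(pop, key=fitness, reverse=True)


# Toy run: the fitness simply counts the 1 bits (hypothetical, for illustration).
ranked = evolve(sum, n_bits=10, pop_size=20, generations=200,
                cx_rate=0.5, mut_rate=0.2)
print(ranked[0], sum(ranked[0]))
```

Running the function once per risk level, each time with a class-specific fitness, and pooling the returned populations corresponds to Steps 1 to 4 of the algorithm.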



Taking the data set shown in Table 1 as an example, and assuming the population size as 50,
the number of generations as 1000, the crossover rate as 0.5, and the mutation rate as 0.2 in
the proposed genetic algorithm, we can obtain an example rule-set, as shown in Table 3.

Dong et al. BMC Medical Informatics and Decision Making 2014, 14:12
http://www.biomedcentral.com/1472-6947/14/12

Table 3 Rules derived from the example patient data-set shown in Table 1

# Rule | Rule | Support | Confidence | Fitness
1 | IF Age is ( Young ) AND Heart Events Recently is ( False ) THEN Risk is Low | 0.083 | 0.5 | 0.291
2 | IF Age is ( Young ) AND Sex is ( Female ) AND Smoke is ( False ) THEN Risk is Low | 0.11 | 0.33 | 0.22
3 | IF Age is ( Young OR Middle-aged ) AND Smoke is ( False ) AND Heart Events Recently is ( True ) THEN Risk is Medium | 0.244 | 0.514 | 0.378
4 | IF Age is ( Old ) AND Smoke is ( False ) THEN Risk is Medium | 0.339 | 0.465 | 0.402
5 | IF Age is ( Old OR Very-Old ) AND Smoke is ( True ) THEN Risk is High | 0.269 | 0.377 | 0.323
6 | IF Age is ( Middle-aged OR Old OR Very-Old ) AND Sex is ( Male ) AND Heart Events Recently is ( True ) THEN Risk is High | 0.324 | 0.455 | 0.390

UA risk assessment model 

Based on the derived rule set, we can generate a classification model for UA risk
assessment. Formally, let $\Re$ be the set of derived UA risk assessment rules. For each
$r : Cond \rightarrow Class$ ($r \in \Re$), the score value of the target class
($class \in$ {low-risk, medium-risk, high-risk}) of $r$ with respect to a given patient case
$\sigma$ can be assessed by using the following equation:

$$v_{class} = \sum_{r \in \Re} Confidence(r)\, \beta(r, \sigma) \qquad (12)$$

where the sum runs over the rules of $\Re$ whose consequent is $class$, $Confidence(r)$ is
the confidence value of rule $r$, and $\beta(r, \sigma)$ is the firing strength of the input
patient case $\sigma$ on the antecedent part of rule $r$.
The firing strength $\beta(r, \sigma)$ is defined as

$$\beta(r, \sigma) = \sum_{((a, l_a^1), \cdots, (a, l_a^j)) \in Cond} \max\{\mu_{a,1}(\sigma(a)), \cdots, \mu_{a,m}(\sigma(a))\} \qquad (13)$$

where $\mu_{a,j}$ represents the fuzzy membership function for the pair of $a$ and the
linguistic term $l_a^j$ in the $((a, l_a^1), \cdots, (a, l_a^m))$ ($1 \le j \le m$) in the
antecedent part Cond of rule $r$.

With respect to the target values of risk levels, i.e., low-risk, medium-risk, high-risk, the
corresponding scores $v_l$, $v_m$, $v_h$ can be generated based on Equation (12). The risk
level with the top score in the scoring vector $v$ is the predicted risk level for the
patient case $\sigma$.

$$Class_\sigma = \arg\max(\{v_l, v_m, v_h\}) \qquad (14)$$

Taking $\sigma_1$ shown in Table 1 as an example, and using the derived rule-set shown in
Table 3, the score values for $\sigma_1$ are calculated as $v_l = 0.4$, $v_m = 2.088$, and
$v_h = 1.745$, by Equation (12). Thus, the predicted risk level for $\sigma_1$ is
medium-risk.
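The scoring scheme of Equations (12) to (14) can be sketched in Python using the six rules of Table 3. The sketch assumes that the firing strength sums, over the features in the antecedent, the best membership among the admitted labels, and that each class score accumulates only the rules with that consequent. The membership degrees of the example case are assumed values (age 74 taken as fully "old"), so the resulting scores are illustrative and need not coincide with those reported in the text. All identifiers are hypothetical.

```python
from typing import Dict, List, Set, Tuple

Memberships = Dict[str, Dict[str, float]]          # feature -> label -> degree
Rule = Tuple[Dict[str, Set[str]], str, float]      # (antecedent, class, confidence)


def firing_strength(case: Memberships, cond: Dict[str, Set[str]]) -> float:
    """Sum over antecedent features of the best-matching label's membership."""
    return sum(max(case[f].get(lbl, 0.0) for lbl in labels)
               for f, labels in cond.items())


def classify(case: Memberships, rules: List[Rule]) -> Tuple[str, Dict[str, float]]:
    """Score each risk level by confidence-weighted firing strength; pick the top."""
    scores = {"low-risk": 0.0, "medium-risk": 0.0, "high-risk": 0.0}
    for cond, cls, conf in rules:
        scores[cls] += conf * firing_strength(case, cond)
    return max(scores, key=scores.get), scores


# Antecedents and confidences of the six rules listed in Table 3.
RULES: List[Rule] = [
    ({"age": {"young"}, "heart events": {"no"}}, "low-risk", 0.5),
    ({"age": {"young"}, "sex": {"female"}, "smoke": {"no"}}, "low-risk", 0.33),
    ({"age": {"young", "middle-aged"}, "smoke": {"no"}, "heart events": {"yes"}},
     "medium-risk", 0.514),
    ({"age": {"old"}, "smoke": {"no"}}, "medium-risk", 0.465),
    ({"age": {"old", "very old"}, "smoke": {"yes"}}, "high-risk", 0.377),
    ({"age": {"middle-aged", "old", "very old"}, "sex": {"male"},
      "heart events": {"yes"}}, "high-risk", 0.455),
]

# First case of Table 1 (74, male, non-smoker, recent heart events);
# the age membership is an assumed value.
case_1: Memberships = {"age": {"old": 1.0}, "sex": {"male": 1.0},
                       "smoke": {"no": 1.0}, "heart events": {"yes": 1.0}}

label, scores = classify(case_1, RULES)
print(label, scores)
```

Even with the assumed memberships, the medium-risk score is the largest, which agrees with the physician assessment of this case in Table 1.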



Table 4 Patient features utilized in UA risk assessment

Numerical feature
Name                                    | Fuzzy sets
Age                                     | {Young, middle-aged, old, very old}
Systolic Blood Pressure (SBP)           | {Low, medium, high}
Diastolic Blood Pressure (DBP)          | {Low, low-medium, high-medium, high}
Creatinine                              | {Low, medium, high}
AST                                     | {Low, medium, high}
LDH                                     | {Low, medium, high}
CK                                      | {Low, medium, high}
CKMB                                    | {Very low, low, medium, high}
CnT                                     | {Low, high}

Categorical feature
Name                                    | Crisp sets
Sex                                     | {Male, female}
Smoke                                   | {Yes, no}
Atrial premature beat                   | {Yes, no}
Significant change in ST in ECG         | {Yes, no}
Has heart events recently               | {Yes, no}
Anamnesis of coronary heart disease     | {Yes, no}
Anamnesis of renal insufficiency        | {Yes, no}
Anamnesis of bleeding                   | {Yes, no}
Anamnesis of diabetes                   | {None, type 1, type 2}
Anamnesis of hyperlipaemia              | {Yes, no}
Anamnesis of hypertension               | {None, level I, level II, level III}
Has aspirin in recent 7 days            | {Yes, no}
Has percutaneous coronary intervention  | {Yes, no}
Lungs event as expectoration with blood | {Yes, no}
Change in T                             | {Yes, no}
Heart event by drinking                 | {Yes, no}
Irregular pulse                         | {Yes, no}
Results and discussion

To evaluate the feasibility of the presented methods, a clinical case study was conducted in
cooperation with the Cardiology Department of the Chinese PLA General Hospital. The data-set
collected from the hospital consists of 54 patient cases. The target classes of UA risk
levels include: low-risk, medium-risk, and high-risk. The physicians who evaluated these
cases are experienced clinicians working for the hospital, with 10 years of working
experience on average. As a result, 16 cases are classified into the low-risk group, 33 cases
into the medium-risk group, and 5 cases into the high-risk group. The patient features
(9 numerical features and 17 categorical features) that appear in the data-set are shown in
Table 4. These features are regularly recorded in UA treatment practice.

All experiments were performed on a Lenovo compatible PC with an Intel Pentium IV 2.8 GHz
CPU and 4 GB of main memory, running Microsoft Windows 7. The algorithms were implemented in
Microsoft C#. A 10-fold cross-validation was performed to evaluate the proposed method, using
90% of the data-set as the training set and the remaining 10% as the validation set. To
reduce variability, 10 rounds of this validation process were performed using different
partitions.
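The evaluation protocol (10 rounds of 10-fold cross-validation over 54 cases) can be sketched as an index generator in Python. How the partitions were actually drawn is not specified in the text; the sketch assumes a fresh random shuffle per round and folds of nearly equal size. The function name and the fixed seed are illustrative.

```python
import random
from typing import Iterator, List, Tuple


def repeated_kfold(n_cases: int, k: int = 10, rounds: int = 10,
                   seed: int = 0) -> Iterator[Tuple[int, List[int], List[int]]]:
    """Yield (round, training indices, validation indices) for repeated k-fold CV."""
    rng = random.Random(seed)
    for r in range(rounds):
        order = list(range(n_cases))
        rng.shuffle(order)                        # a different partition per round
        folds = [order[i::k] for i in range(k)]   # fold sizes differ by at most one
        for held_out in folds:
            train = [i for i in order if i not in held_out]
            yield r, train, held_out


splits = list(repeated_kfold(54))
print(len(splits), len(splits[0][1]), len(splits[0][2]))
```

With 54 cases, four folds per round hold six cases and six folds hold five, so each validation set is roughly 10% of the data and every case is validated exactly once per round.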

For the first step of patient feature discretization, we 
applied Hierarchical Agglomerative Clustering method to 
each numerical feature to generate a set of fuzzy sets. The 
derived fuzzy sets of input numerical patient features are 
shown in Figure 3. 

To mine fuzzy association rules, we have taken the pop- 
ulation size as 100, the number of generation as 1000, the 
crossover rate as 1.0, and the mutation rate as 0.2 in the 
proposed genetic algorithm. By using these parameters, 
we run our genetic algorithm for each target class of UA 
risk level, one by one, to obtain a set of fuzzy rules for a 
given class. 

Table 5 shows the rules obtained, which have the best fitness values for the target classes
of UA risk level (i.e., low-risk, medium-risk, and high-risk). In this table, the number of
patient features involved in each rule (# of features), and the Support and Confidence of
each rule are shown. The values of Support and Confidence are between zero and one. A high
support value means that the rule covers most of the patient cases categorized into the
class, and a high confidence value means that the rule has few negative patient cases [12].
Note that the knowledge discovered for each target risk level is understandable by physicians
due to the use of fuzzy logic and the low number of rules and of conditions in the rule
antecedents (below 40% of the 26 patient features). Tables 6, 7 and 8 show the rules obtained
which have the best fitness values corresponding to the target classes of risk level.

Figure 3 Membership function of input numerical patient features.

Table 5 Results for low-risk, medium-risk, and high-risk

Risk level | # of features | Support | Confidence
Low        | 9             | 0.204   | 0.375
Low        | 8             | 0.195   | 0.380
Medium     | 10            | 0.448   | 0.633
Medium     | 10            | 0.449   | 0.631
High       | 10            | 0.067   | 0.114
High       | 9             | 0.064   | 0.115

With the fuzzy rule base acquired, UA risk assessment can be completed through the proposed
classification model. As mentioned above, the ensemble of fuzzy rules and the proposed
classification model play the role of a mathematical function that produces the system
output, i.e., the stratification of UA risk. In this way, for each patient whose data are
supplied as system inputs, the most likely risk level for that patient is generated.

The comparison between the proposed model and physicians' decisions is done for each
partition of the data-set and is given in Figure 4. The proposed system made the same
decisions as the physician in 46 of the 54 tested cases (85.2%). As Figure 4 shows, the
proposed model did not predict the risk well in the following cases:

• the fifth (the physician assessment is medium-risk while the proposed model assessment is
low-risk),
• the twenty-third (the physician assessment is low-risk while the proposed model assessment
is medium-risk),
• the twenty-ninth (the physician assessment is medium-risk while the proposed model
assessment is low-risk),
• the thirty-third (the physician assessment is low-risk while the proposed model assessment
is medium-risk),
• the thirty-fourth (the physician assessment is medium-risk while the proposed model
assessment is low-risk),
• the fortieth (the physician assessment is medium-risk while the proposed model assessment
is low-risk),
• the forty-second (the physician assessment is medium-risk while the proposed model
assessment is low-risk), and
• the forty-seventh (the physician assessment is medium-risk while the proposed model
assessment is low-risk).

Table 6 Rules for low-risk
# Rule  Rule

1  IF Age is ( Young OR Middle-aged ) AND AST is ( Low OR
   Medium ) AND Creatinine is ( Low OR Medium ) AND CK is
   ( Low OR Medium ) AND LDH is ( Low OR Medium ) AND
   CK MB is ( Medium ) AND has PCI is ( No ) AND Hypertension
   is ( No OR Level I OR Level II ) AND Arrhythmia is ( No ) THEN
   Risk is Low

2  IF Age is ( Young OR Middle-aged ) AND AST is ( Low OR
   Medium ) AND Creatinine is ( Low OR Medium ) AND CK is
   ( Low OR Medium ) AND LDH is ( Low OR Medium ) AND
   CK MB is ( Medium ) AND has PCI is ( No ) AND Hypertension
   is ( No OR Level I OR Level II ) THEN Risk is Low

Table 7 Rules for medium-risk
# Rule  Rule

1  IF Age is ( Old ) AND Creatinine is ( Low OR Medium ) AND CK is
   ( Low OR Medium ) AND LDH is ( Medium ) AND SBP is ( Medium
   OR High ) AND Heart Events Recently is ( Yes ) AND Aspirin in
   Recent 7 days is ( No ) AND has PCI is ( Yes ) AND Hypertension
   is ( Level I OR Level II OR Level III ) AND Hyperlipaemia is ( No )
   THEN Risk is Medium

2  IF Age is ( Middle-aged OR Old ) AND Creatinine is ( Low OR
   High ) AND CK is ( Low OR High ) AND LDH is ( Medium ) AND
   SBP is ( Medium OR High ) AND Heart Events Recently is ( Yes )
   AND Aspirin in Recent 7 days is ( No ) AND has PCI is ( Yes )
   AND Hypertension is ( Level I OR Level II OR Level III ) AND
   Hyperlipaemia is ( No ) THEN Risk is Medium

Table 8 Rules for high-risk
# Rule  Rule

1  IF Age is ( Old OR Very-Old ) AND AST is ( High ) AND Creatinine
   is ( High ) AND CK is ( Medium ) AND LDH is ( Medium OR High )
   AND CK MB is ( High ) AND SBP is ( Low OR Medium ) AND
   Smoke is ( Yes ) AND Hypertension is ( Level II OR Level III ) AND
   Diabetes is ( Type 1 OR Type 2 ) THEN Risk is High

2  IF Age is ( Old OR Very-Old ) AND AST is ( High ) AND Creatinine
   is ( High ) AND CK is ( Medium ) AND LDH is ( Medium OR High )
   AND CK MB is ( High ) AND SBP is ( Low OR Medium ) AND
   Hypertension is ( Level II OR Level III ) AND Diabetes is ( Type 1
   OR Type 2 ) THEN Risk is High

Figure 4 Comparison between the proposed model and physicians' decisions. (Horizontal axis:
patient case number; legend entry: physicians' decisions.)

Furthermore, we measure the accuracy of the proposed approach using two performance
measures: sensitivity (probability that the test correctly classifies a case with a specific
risk level) and specificity (probability of correctly classifying a case without a specific
risk level).

$$Sensitivity(v_{class}) = \frac{|TP|}{|TP| + |FN|} \qquad (15)$$

$$Specificity(v_{class}) = \frac{|TN|}{|TN| + |FP|} \qquad (16)$$

where $v_{class} \in$ {low-risk, medium-risk, high-risk}; TP is the set True Positive,
patient cases with the specific risk level $v_{class}$ classified correctly; FN is the set
False Negative, patient cases with the specific risk level $v_{class}$ classified as other
risk levels; TN is the set True Negative, patient cases without the specific risk level
$v_{class}$ classified as other risk levels; FP is the set False Positive, patient cases
without the specific risk level $v_{class}$ classified as $v_{class}$. Table 9 presents the
sensitivity and specificity obtained for each risk level. The experimental results indicate
that the proposed method is feasible for predicting risk levels of unstable angina patients.
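The two measures of Equations (15) and (16) can be computed per risk level in a one-vs-rest fashion. The following Python sketch shows the counting; the label lists are hypothetical and are not the study data, and the function name is an illustrative choice.

```python
from typing import List, Tuple


def sensitivity_specificity(actual: List[str], predicted: List[str],
                            risk_level: str) -> Tuple[float, float]:
    """One-vs-rest sensitivity and specificity for a single risk level."""
    pairs = list(zip(actual, predicted))
    tp = sum(a == risk_level and p == risk_level for a, p in pairs)
    fn = sum(a == risk_level and p != risk_level for a, p in pairs)
    tn = sum(a != risk_level and p != risk_level for a, p in pairs)
    fp = sum(a != risk_level and p == risk_level for a, p in pairs)
    sens = tp / (tp + fn) if tp + fn else float("nan")
    spec = tn / (tn + fp) if tn + fp else float("nan")
    return sens, spec


# Hypothetical labels, not the study data.
physician = ["low", "low", "medium", "medium", "medium", "high"]
model = ["low", "medium", "medium", "low", "medium", "high"]
for level in ("low", "medium", "high"):
    print(level, sensitivity_specificity(physician, model, level))
```

In this toy example one of the two low-risk cases is missed (sensitivity 0.5) and one of the four other cases is wrongly labelled low-risk (specificity 0.75).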

Conclusion 

In this paper, we have presented an intelligent sys- 
tem for UA risk assessment by combining genetic 
algorithm and fuzzy association rule mining. The devel- 
oped approach has been tested on a data-set consisting of 
54 UA patient cases from the Cardiology department of 
Chinese PLA General hospital. The experimental results 
show that considerable agreement is achieved between 
the proposed approach and physicians' problem solving 
knowledge. 

The main novelty of the developed model is that it represents a valuable objective tool for
UA risk assessment. In the medical literature, physicians disagree about the risk factors
highlighted. This research has focused on the application of computational intelligence. In
particular, a genetic-fuzzy system is proposed to identify the key factors behind UA, which
could be used for educational purposes and, with further improvements, could assist and guide
young physicians in their daily work.

In future studies, the effectiveness of the proposed system may be compared with that of
traditional UA risk assessment models, such as TIMI and GRACE. The application of the
proposed system to other kinds of CVD, such as heart failure, will also be investigated.
Furthermore, other computational intelligence techniques can be associated with the developed
system.

Table 9 Sensitivity and specificity with different risk levels

            | Sensitivity | Specificity
Low-risk    | 0.8125      | 0.8421
Medium-risk | 0.8182      | 0.85
High-risk   | 1.0         | 1.0